

Reinforcement Learning in Strategy-Based and Atari Games: A Review of Google DeepMind's Innovations

Shaheen, Abdelrhman, Badr, Anas, Abohendy, Ali, Alsaadawy, Hatem, Alsayad, Nadine

arXiv.org Artificial Intelligence

Reinforcement Learning (RL) has been widely used in many applications, particularly in gaming, which serves as an excellent training ground for AI models. Google DeepMind has pioneered innovations in this field, employing reinforcement learning algorithms, including model-based, model-free, and deep Q-network approaches, to create advanced AI models such as AlphaGo, AlphaGo Zero, and MuZero. AlphaGo, the initial model, integrates supervised learning and reinforcement learning to master the game of Go, surpassing professional human players. AlphaGo Zero refines this approach by eliminating reliance on human gameplay data, instead utilizing self-play for enhanced learning efficiency. MuZero further extends these advancements by learning the underlying dynamics of game environments without explicit knowledge of the rules, achieving adaptability across various games, including complex Atari games. This paper reviews the significance of reinforcement learning applications in Atari and strategy-based games, analyzing these three models, their key innovations, training processes, challenges encountered, and improvements made. Additionally, we discuss advancements in the field of gaming, including MiniZero and multi-agent models, highlighting future directions and emerging AI models from Google DeepMind.


Inference Scaling Reshapes AI Governance

Ord, Toby

arXiv.org Artificial Intelligence

The shift from scaling up the pre-training compute of AI systems to scaling up their inference compute may have profound effects on AI governance. The nature of these effects depends crucially on whether this new inference compute will primarily be used during external deployment or as part of a more complex training programme within the lab. Rapid scaling of inference-at-deployment would: lower the importance of open-weight models (and of securing the weights of closed models), reduce the impact of the first human-level models, change the business model for frontier AI, reduce the need for power-intense data centres, and derail the current paradigm of AI governance via training compute thresholds. Rapid scaling of inference-during-training would have more ambiguous effects that range from a revitalisation of pre-training scaling to a form of recursive self-improvement via iterated distillation and amplification. The intense year-on-year scaling up of AI training runs has been one of the most dramatic and stable markers of the Large Language Model era. Indeed, it had been widely taken to be a permanent fixture of the AI landscape and the basis of many approaches to AI governance. But recent reports from unnamed employees at the leading labs suggest that their attempts to scale up pre-training substantially beyond the size of GPT-4 have led to only modest gains, insufficient to justify continuing such scaling and perhaps even insufficient to warrant public deployment of those models (Hu & Tong, 2024). A possible reason is that they are running out of high-quality training data. While the scaling laws might still be operating (given sufficient compute and data, the models would keep improving), the ability to harness them through rapid scaling of pre-training may not.


Monte Carlo Tree Search (MCTS) in AlphaGo Zero

#artificialintelligence

In a Go game, AlphaGo Zero uses Monte Carlo Tree Search (MCTS) to build a local policy for sampling the next move. MCTS explores possible moves and records the results in a search tree; as more searches are performed, the tree grows and its statistics become more informative. To make a move, AlphaGo Zero runs 1,600 searches and then constructs a local policy from the resulting visit counts.
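The select/expand/simulate/backpropagate cycle, and the final "local policy from visit counts" step, can be sketched in miniature. The sketch below is illustrative only: it plays a toy Nim game (take 1–3 stones, taking the last stone wins) and uses plain UCT with random rollouts, whereas AlphaGo Zero guides search with a neural network (PUCT) and has no rollouts; the game, node structure, and constants here are assumptions for the demo, with only the 1,600-simulation budget taken from the text.

```python
import math
import random

TAKE = (1, 2, 3)  # Nim: remove 1-3 stones; whoever takes the last stone wins

def legal_moves(pile):
    return [m for m in TAKE if m <= pile]

class Node:
    def __init__(self, pile, parent=None):
        self.pile, self.parent = pile, parent
        self.children = {}            # move -> child Node
        self.visits, self.wins = 0, 0.0

def uct_select(node, c=1.4):
    # Pick the child maximising win rate plus an exploration bonus (UCT).
    return max(node.children.items(),
               key=lambda kv: kv[1].wins / kv[1].visits
               + c * math.sqrt(math.log(node.visits) / kv[1].visits))

def rollout(pile):
    # Random playout; returns 1.0 if the player to move at `pile` wins.
    turn = 0
    while pile > 0:
        pile -= random.choice(legal_moves(pile))
        if pile == 0:
            return 1.0 if turn == 0 else 0.0
        turn ^= 1
    return 0.0  # empty pile: the player to move has already lost

def search(root_pile, n_sims=1600):
    root = Node(root_pile)
    for _ in range(n_sims):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while node.pile > 0 and len(node.children) == len(legal_moves(node.pile)):
            _, node = uct_select(node)
        # 2. Expansion: add one untried move.
        if node.pile > 0:
            move = random.choice(
                [m for m in legal_moves(node.pile) if m not in node.children])
            node.children[move] = Node(node.pile - move, parent=node)
            node = node.children[move]
        # 3. Simulation, scored for the player who just moved into `node`.
        value = 1.0 - rollout(node.pile)
        # 4. Backpropagation: flip the perspective at each level.
        while node is not None:
            node.visits += 1
            node.wins += value
            value = 1.0 - value
            node = node.parent
    # Local policy: normalised visit counts over the root's moves.
    total = sum(ch.visits for ch in root.children.values())
    return {m: ch.visits / total for m, ch in root.children.items()}
```

For a pile of 5 stones the only winning move is to take 1 (leaving a multiple of 4), and after 1,600 simulations the visit-count policy concentrates on it.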


Top Real World Applications of Reinforcement Learning in 2022

#artificialintelligence

Reinforcement Learning is a subfield of Machine Learning in which an agent explores an environment to learn how to perform specific tasks by taking actions with a good outcome and avoiding those with a bad one. A reinforcement learning model will learn from its experiences and will identify which actions lead to the best rewards. In reinforcement learning, the agent takes action based on the state of the environment, and the environment will return the reward and the next state. The agent employs a trial and error method to learn. It initially takes random actions and identifies which actions lead to long-term rewards over time.
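The loop described above — the agent acts on the current state, the environment returns a reward and the next state, and trial and error gradually favours actions with long-term payoff — can be shown concretely. The sketch below is a minimal illustration, not any particular production system: it uses tabular Q-learning with epsilon-greedy exploration on an invented five-state corridor where only the goal state pays a reward.

```python
import random
from collections import defaultdict

# Toy environment: a 5-state corridor. The agent starts at 0; actions move
# left (-1) or right (+1); reaching state 4 yields reward +1 and ends the episode.
GOAL = 4
ACTIONS = (-1, +1)

def step(state, action):
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL      # next state, reward, done

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    Q = defaultdict(float)               # (state, action) -> value estimate
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Trial and error: explore with probability eps, else exploit
            # (ties broken at random so early behaviour is not biased).
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: (Q[(s, x)], random.random()))
            s2, r, done = step(s, a)
            # Update toward the reward plus the discounted best next value.
            best_next = 0.0 if done else max(Q[(s2, x)] for x in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

After training, the learned values identify "move right" as the action with the best long-term reward in every non-goal state, even though only the final step is ever rewarded directly.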


Leela Zero Score: a Study of a Score-based AlphaGo Zero

Pasqualini, Luca, Parton, Maurizio, Morandin, Francesco, Amato, Gianluca, Gini, Rosa, Metta, Carlo

arXiv.org Artificial Intelligence

AlphaGo, AlphaGo Zero, and all of their derivatives can play with superhuman strength because they are able to predict the win-lose outcome with great accuracy. However, Go as a game is decided by a final score difference, and in final positions AlphaGo plays suboptimal moves: this is not surprising, since AlphaGo is completely unaware of the final score difference, all winning final positions being equivalent from the winrate perspective. This can be an issue, for instance when trying to learn the "best" move or to play with an initial handicap. Moreover, there is the theoretical quest of the "perfect game", that is, the minimax solution. Thus, a natural question arises: is it possible to train a successful Reinforcement Learning agent to predict score differences instead of winrates? No empirical or theoretical evidence can be found in the literature to support the folklore statement that "this does not work". In this paper we present Leela Zero Score, a software designed to support or disprove the "does not work" statement. Leela Zero Score is designed on the open-source solution known as Leela Zero, and is trained on a 9x9 board to predict score differences instead of winrates. We find that the training produces a rational player, and we analyze its style against a strong amateur human player, to find that it is prone to some mistakes when the outcome is close. We compare its strength against SAI, an AlphaGo Zero-like software working on the 9x9 board, and find that the training of Leela Zero Score has reached a premature convergence to a player weaker than SAI.


AI in Games: Techniques, Challenges and Opportunities

Yin, Qiyue, Yang, Jun, Ni, Wancheng, Liang, Bin, Huang, Kaiqi

arXiv.org Artificial Intelligence

With the breakthrough of AlphaGo, AI in human-computer games has become a very hot topic attracting researchers all around the world, as games usually serve as an effective standard for testing artificial intelligence. Various game AI systems (AIs) have been developed, such as Libratus, OpenAI Five and AlphaStar, beating professional human players. In this paper, we survey recent successful game AIs, covering board game AIs, card game AIs, first-person shooting game AIs and real-time strategy game AIs. Through this survey, we 1) compare the main difficulties among different kinds of games for the intelligent decision-making field; 2) illustrate the mainstream frameworks and techniques for developing professional-level AIs; 3) raise the challenges or drawbacks in the current AIs for intelligent decision making; and 4) try to propose future trends in games and intelligent decision-making techniques. Finally, we hope this brief review can provide an introduction for beginners and inspire insights for researchers in the field of AI in games.


Why AI Chess Champs Are Not Taking Over the World

#artificialintelligence

At one time, the AI that beat humans at chess calculated strategies by studying the outcomes of human moves. In October 2017, the DeepMind team published details of a new Go-playing system, AlphaGo Zero, that studied no human games at all. Instead, it started with the game's rules and played against itself. The first moves it made were completely random. After each game, it folded in new knowledge of what led to a win and what didn't.


A Quick Primer on Self-Play in Deep Reinforcement Learning

#artificialintelligence

"Train tirelessly to defeat the greatest enemy, yourself, and to discover the greatest master, yourself" DeepMind has created AI that will crush any human player in Go, Chess, Shogi, and Starcraft 2. OpenAI has made similar strides in complex strategy games, notably in Dota 2. The agents in these games all achieved mastery using deep reinforcement learning. Yet, this is only part of the story. What was the magic sauce that sent these systems' playing ability out of the atmosphere? A simple framework called self-play, where your opponent is yourself. Self-play is a framework where an agent learns to play a game by playing against itself.
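A minimal version of that framework — one agent, playing both sides of a game against itself and learning from the outcomes — can be sketched as follows. This is an illustrative stand-in, not DeepMind's or OpenAI's method: it learns a tabular value function for a toy Nim game (take 1–3 stones, last stone wins) by temporal-difference updates from self-play results; the game, table, and constants are all assumptions for the demo.

```python
import random

TAKE = (1, 2, 3)   # Nim: take 1-3 stones; whoever takes the last stone wins

def self_play_train(start=10, games=2000, alpha=0.1, eps=0.2):
    # V[pile] = estimated win probability for the player about to move.
    V = {p: 0.5 for p in range(start + 1)}
    V[0] = 0.0      # no stones left: the player to move has already lost
    for _ in range(games):
        pile, history = start, []
        while pile > 0:
            moves = [m for m in TAKE if m <= pile]
            if random.random() < eps:
                m = random.choice(moves)        # explore
            else:
                # Both sides use the SAME policy: leave the opponent the
                # position with the lowest estimated value.
                m = min(moves, key=lambda m: V[pile - m])
            history.append(pile)
            pile -= m
        # The player who moved last won. Walk back through the visited
        # positions, alternating win/loss from each mover's perspective.
        outcome = 1.0
        for p in reversed(history):
            V[p] += alpha * (outcome - V[p])
            outcome = 1.0 - outcome
    return V
```

With no opponent other than itself, the agent discovers Nim's structure: piles that are multiples of 4 are losing for the player to move, and the learned values reflect that — the same self-improvement dynamic, in miniature, that self-play provides at scale.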


Artificial Intelligence

#artificialintelligence

Learn to write programs using the foundational AI algorithms powering everything from NASA's Mars Rover to DeepMind's AlphaGo Zero.


Space Force scientist says it's 'imperative' military uses human augmentation by employing AI agents

Daily Mail - Science & tech

Combining humans with machines to create superhuman intelligence may soon no longer be the plot of science-fiction films, as the US Space Force's chief scientist says it will happen in 'the coming decade.' Dr. Joel Mozer, speaking at an event at the Air Force Research Laboratory Wednesday, announced we are entering the age of 'human augmentation,' which is crucial to the US's national defense in order to not 'fall behind our strategic competitors.' However, his proposal does not turn humans into cyborgs, but employs 'AI agents' to assist with strategic military planning. Mozer highlighted the abilities of AlphaGo Zero, developed by a Google subsidiary, which was able to train itself to play the game of Go at a master level in just a few weeks. Mozer suggests these extraordinary capabilities can lead to superhuman performance by combining human ingenuity with the power, speed and efficiency of machines.